tmap: How many towns with official bilingual names in Belgium?
It seems that we Belgians just love confusing foreigners…
Imagine wanting to take a train from Ghent to Mons, but the only one on the board is going to Bergen. Or driving south with a GPS telling you to follow the direction of Liège, while the road signs keep indicating Luik for as long as you are on Flemish territory.
Mons/Bergen, Liège/Luik, Ypres/Ieper… those names refer to exactly the same city - one of them is the official French name, the other one the official Dutch one.
Two weeks ago, I heard yet another story from foreigners who got very confused, and I realized I have no idea how many towns/cities like this we have. Sounds like a perfect time to work on my spatial object skills!
I found everything I needed on this website from the Belgian government. The data is from early 2017.
Starting by loading the packages needed:
#packages for the data exploration
library(tidyverse)
library(readxl)
library(ggplot2)
#packages for the maps
library(sp)
library(tmap)
library(viridisLite)
library(leaflet)
library(BelgiumMaps.StatBel)
Importing the data:
#Importing the data
raw_data <- read_excel("TF_SOC_POP_STRUCT_2017_tcm325-283761.xlsx", sheet=1)
The data contains a lot of unneeded administrative data, and I wanted to rename some columns to English.
#Keeping only the variables needed
data <- raw_data %>%
select(contains("MUNTY"), TX_RGN_DESCR_NL, CD_SEX, TX_NATLTY_NL, TX_CIV_STS_NL, CD_AGE, MS_POPULATION)
colnames(data) <- c("REFNIS", "TownNL", "TownFR", "Region", "Sex", "Nationality", "MaritalStatus", "Age", "Population")
#Translating Region names to English
data$Region <- data$Region %>%
str_replace("Vlaams Gewest", "Flanders") %>%
str_replace("Waals Gewest", "Wallonia") %>%
str_replace("Brussels Hoofdstedelijk Gewest", "Brussels agglomeration")
Additionally, the data does not contain a single population count per town; it is split into demographic subsets.
If I wanted to know how many people in my home town share my gender, age, nationality and marital status, I could look it up (26, by the way). But since that’s not really what I’m after, I used dplyr to create a summary population table, and immediately added a new boolean column comparing the town names in Flemish and French.
#Creating a dataframe with total population for each town, and adding a column to see whether they have the same name
popdata1 <- data %>%
group_by(TownNL, TownFR, Region, REFNIS) %>%
summarise(population=sum(Population)) %>%
arrange(desc(population)) %>%
mutate(SameName = TownNL==TownFR) %>%
ungroup()
Quite quickly an issue presented itself though: while browsing through some breakouts, I noticed that some town names are annotated with their district. Beveren, for instance, has the same name in Flemish and French, but its district got translated, so it was flagged as a town with a different name in each language.
#Noticing an issue:
popdata1%>%
filter(Region=="Flanders") %>%
filter(!SameName) %>%
slice (11:13)
## # A tibble: 3 x 6
## TownNL TownFR Region REFNIS
## <chr> <chr> <chr> <chr>
## 1 Beveren (Sint-Niklaas) Beveren (Saint-Nicolas) Flanders 46003
## 2 Dendermonde Termonde Flanders 42006
## 3 Vilvoorde Vilvorde Flanders 23088
## # ... with 2 more variables: population <dbl>, SameName <lgl>
To get rid of the districts, I cleaned out any word pattern between brackets, and re-generated a boolean column DiffName to see whether the town names are different.
#Removing the sectors between brackets
popdata <- popdata1
popdata$TownNL <- str_replace(popdata$TownNL, pattern="\\s\\(.+\\)", replacement="")
popdata$TownFR <- str_replace(popdata$TownFR, pattern="\\s\\(.+\\)", replacement="")
#Reassessing whether the names are the same
popdata <- popdata %>%
mutate(DiffName = TownNL != TownFR) %>%
select(TownNL, TownFR, DiffName, population, Region, REFNIS)
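To see what that pattern does in isolation, here is a minimal sketch on a few hand-picked names from the output above. It assumes base R's sub() with perl = TRUE as a stand-in for str_replace(), so it runs without loading stringr; strip_district is a hypothetical helper name, not part of the original analysis.

```r
# Hypothetical helper: drop a " (district)" annotation from a town name.
# sub() replaces the first match only, like stringr::str_replace().
strip_district <- function(x) {
  sub("\\s\\(.+\\)", "", x, perl = TRUE)
}

town_nl <- c("Beveren (Sint-Niklaas)", "Dendermonde")
town_fr <- c("Beveren (Saint-Nicolas)", "Termonde")

clean_nl <- strip_district(town_nl)
clean_fr <- strip_district(town_fr)

# Beveren now compares as identical; Dendermonde/Termonde still differs
data.frame(clean_nl, clean_fr, DiffName = clean_nl != clean_fr)
```

Names without brackets pass through unchanged, which is why the same replacement can be applied to the whole column.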
There are 95 towns/cities with two different official names, which is about 16% of all towns. Contrary to what some people assume, the share is similar in both regions: 13% of Flemish towns have an official French name, and 16% of Walloon towns have an official Flemish name on top. Only Brussels, an officially bilingual region, has a much higher percentage of ‘double names’.
#How many have exactly the same name?
popdata %>%
summarise(NTowns_DiffName = sum(DiffName), Prop_DiffName = mean(DiffName)) %>%
knitr::kable()
| NTowns_DiffName | Prop_DiffName |
|---|---|
| 95 | 0.1612903 |
#by region
popdata %>%
group_by(Region) %>%
summarise(NTowns=n(), N_SameName=n()-sum(DiffName), N_DiffName=sum(DiffName),
Prop_SameName =1-round(mean(DiffName),2), Prop_DiffName=round(mean(DiffName),2))%>%
knitr::kable()
| Region | NTowns | N_SameName | N_DiffName | Prop_SameName | Prop_DiffName |
|---|---|---|---|---|---|
| Brussels agglomeration | 19 | 6 | 13 | 0.32 | 0.68 |
| Flanders | 308 | 269 | 39 | 0.87 | 0.13 |
| Wallonia | 262 | 219 | 43 | 0.84 | 0.16 |
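The counts and proportions in these tables rely on R treating TRUE as 1 and FALSE as 0, so sum() counts the towns with two names and mean() gives their share. A small sketch of that idea, rebuilding a logical vector from the totals reported above (95 towns out of 19 + 308 + 262 = 589) rather than from the real data:

```r
# Stand-in for popdata$DiffName, built from the reported totals
n_towns <- 19 + 308 + 262
diff_name <- rep(c(TRUE, FALSE), times = c(95, n_towns - 95))

n_diff <- sum(diff_name)      # number of TRUE values
prop_diff <- mean(diff_name)  # share of TRUE values

n_diff
round(prop_diff, 2)
```

The same trick is what makes `sum(DiffName)` and `mean(DiffName)` work inside summarise(), per region or overall.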
Using tmap I created my first two maps: one that shows the regions of Belgium, and a second, comparative one highlighting just the towns that have two official names.
#Importing SPdataframe for Belgium
data("BE_ADMIN_MUNTY", package="BelgiumMaps.StatBel")
#creating a Region2 for making the second plot highlighting only DiffName towns
popdatamap <- popdata %>%
mutate(Region2 = ifelse(DiffName==TRUE, Region, NA))
#Merging my 2017 data with the SPdataframe
mapdata <- merge(BE_ADMIN_MUNTY, popdatamap, by.x = "CD_MUNTY_REFNIS", by.y = "REFNIS")
#palette generation
virpalette <- rev(viridis(3))
#Plot different regions
regionplot<- tm_shape(mapdata) +
tm_fill(col="Region", palette=virpalette,
title = "Regions in Belgium")+
tm_polygons()+
tm_layout(legend.position = c("left", "bottom"))
#Plot to show those with a different name by region
nameplot <- tm_shape(mapdata) +
tm_fill(col="Region2", palette=virpalette,
colorNA = "gray90", textNA="Same name",
title = "Different regional town names",legend.position = c("left", "bottom" ))+
tm_polygons()+
tm_layout(legend.position = c("left", "bottom"))
#Show both plots next to each other
tmap_arrange(regionplot, nameplot)
First of all, for people not familiar with Belgium: the left plot shows our basic regions.
* The yellow dot in the middle is the Brussels agglomeration, officially bilingual
* The north in green is Flanders, where the official language is Dutch (or Flemish as we call it)
* The south in purple is Wallonia, where the official language is French
* The divide between green and purple is called the language border…
* To make things even more complicated, some towns in Flanders or Wallonia have a special status: they have “language facilities”. To make something complicated very simple: they are bilingual without being bilingual.
The image on the right just shows all the towns with two official town names. Seeing a higher concentration of these towns around the language border is not a complete surprise, but it does not explain the majority of towns.
## Warning: One tm layer group has duplicated layer types, which are omitted.
## To draw multiple layers of the same type, use multiple layer groups (i.e.
## specify tm_shape prior to each of them).
In the table above it was obvious that the Brussels region has a much higher share of towns with two official names: 68% versus the country average of 16%. Given Brussels’ status as a bilingual region, that should not come as a surprise. I was actually more surprised to realize that there are still 6 that have only their original name. Ganshoren, for instance, is a typical Flemish name that is not that easy to pronounce in French.
#Checking the data on Brussels
popdata %>%
filter(Region=="Brussels agglomeration") %>%
summarise(NTowns=n(), N_SameName=n()-sum(DiffName), N_DiffName=sum(DiffName),
Prop_SameName =1-round(mean(DiffName),2), Prop_DiffName=round(mean(DiffName),2))%>%
knitr::kable()
| NTowns | N_SameName | N_DiffName | Prop_SameName | Prop_DiffName |
|---|---|---|---|---|
| 19 | 6 | 13 | 0.32 | 0.68 |
#List of names for Brussels
popdata %>%
filter(Region=="Brussels agglomeration") %>%
group_by(DiffName) %>%
arrange(desc(DiffName), desc(population)) %>%
knitr::kable()
| TownNL | TownFR | DiffName | population | Region | REFNIS |
|---|---|---|---|---|---|
| Brussel | Bruxelles | TRUE | 176545 | Brussels agglomeration | 21004 |
| Schaarbeek | Schaerbeek | TRUE | 133042 | Brussels agglomeration | 21015 |
| Sint-Jans-Molenbeek | Molenbeek-Saint-Jean | TRUE | 96629 | Brussels agglomeration | 21012 |
| Elsene | Ixelles | TRUE | 86244 | Brussels agglomeration | 21009 |
| Ukkel | Uccle | TRUE | 82307 | Brussels agglomeration | 21016 |
| Vorst | Forest | TRUE | 55746 | Brussels agglomeration | 21007 |
| Sint-Lambrechts-Woluwe | Woluwe-Saint-Lambert | TRUE | 55216 | Brussels agglomeration | 21018 |
| Sint-Gillis | Saint-Gilles | TRUE | 50471 | Brussels agglomeration | 21013 |
| Sint-Pieters-Woluwe | Woluwe-Saint-Pierre | TRUE | 41217 | Brussels agglomeration | 21019 |
| Oudergem | Auderghem | TRUE | 33313 | Brussels agglomeration | 21002 |
| Sint-Joost-ten-Node | Saint-Josse-ten-Noode | TRUE | 27115 | Brussels agglomeration | 21014 |
| Watermaal-Bosvoorde | Watermael-Boitsfort | TRUE | 24871 | Brussels agglomeration | 21017 |
| Sint-Agatha-Berchem | Berchem-Sainte-Agathe | TRUE | 24701 | Brussels agglomeration | 21003 |
| Anderlecht | Anderlecht | FALSE | 118241 | Brussels agglomeration | 21001 |
| Jette | Jette | FALSE | 51933 | Brussels agglomeration | 21010 |
| Etterbeek | Etterbeek | FALSE | 47414 | Brussels agglomeration | 21005 |
| Evere | Evere | FALSE | 40394 | Brussels agglomeration | 21006 |
| Ganshoren | Ganshoren | FALSE | 24596 | Brussels agglomeration | 21008 |
| Koekelberg | Koekelberg | FALSE | 21609 | Brussels agglomeration | 21011 |
#Adding a column to note down the reason for different names
reason_BXL <- popdata %>%
filter(Region=="Brussels agglomeration") %>%
filter(DiffName) %>%
mutate(Reason = "Brussels")
Cities are generally more important, and I would have guessed that most of our cities have two official names. Just looking at the difference in average population between towns that have two names (DiffName==TRUE) and those that don’t, there clearly is a skew towards more populous towns.
A quick plot in ggplot confirms this: grey shows all the towns in Belgium by population size on a logarithmic scale. I coloured those that have two names in green.
popdata %>%
group_by(DiffName) %>%
summarise(mean=mean(population), median=median(population))
## # A tibble: 2 x 3
## DiffName mean median
## <lgl> <dbl> <dbl>
## 1 FALSE 14744.06 11383
## 2 TRUE 42510.78 24701
#Plotting average town size of small and larger towns
ggplot()+
geom_histogram(data=popdata, aes(x=population), fill="grey", alpha=0.6)+
geom_histogram(data=subset(popdata, DiffName==TRUE), aes(x=population), fill="cadetblue4", alpha=1)+
scale_x_log10()+
labs(x= "Population", y="Number of towns", title="Size of towns with two official names amongst all towns in Belgium")
I took a shortcut to define our cities: the 10% most populous towns.
#10% largest towns and cities in Belgium
quantile(popdata$population, probs = seq(from = 0, to = 1, by = .1))
##       0%      10%      20%      30%      40%      50%      60%      70%
##     89.0   4372.2   6341.8   8308.4  10268.4  12123.0  14649.6  18473.6
##      80%      90%     100%
##  23259.6  34189.8 520504.0
#Proportion of Cities with different names
popdata %>%
filter(population > 34000) %>%
summarise(NTowns=n(), N_SameName=n()-sum(DiffName), N_DiffName=sum(DiffName),
Prop_SameName =1-round(mean(DiffName),2), Prop_DiffName=round(mean(DiffName),2))%>%
knitr::kable()
| NTowns | N_SameName | N_DiffName | Prop_SameName | Prop_DiffName |
|---|---|---|---|---|
| 60 | 27 | 33 | 0.45 | 0.55 |
#Adding a reason column
reason_city <- popdata %>%
filter(population > 34000) %>%
filter(Region != "Brussels agglomeration") %>%
filter(DiffName) %>%
mutate(Reason = "City")
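The filters above hard-code 34000, a rounded version of the 90% quantile (34189.8). As a sketch of an alternative, the cutoff can be computed instead of typed in. The populations below are made-up round numbers purely for illustration, and cutoff and is_city are hypothetical names that do not appear in the original analysis:

```r
# Made-up populations for ten imaginary towns (illustration only)
population <- seq(1000, 10000, by = 1000)

# 90% quantile, using R's default quantile definition (type 7)
cutoff <- quantile(population, probs = 0.9)

# Towns above the cutoff count as "cities" in this shortcut definition
is_city <- population > cutoff

cutoff
sum(is_city)
```

With the real data this would read `filter(population > quantile(population, 0.9))`, which may select a slightly different set of towns than the rounded 34000 threshold.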
After World War I, the peace treaty of Versailles listed the annexation of 9 German towns into Belgium as war compensation. They make up our third language region as German is still their main language today.
Given that German and Dutch are both Germanic languages with a lot of similarities, it would make sense that the Flemish kept the German town names, while the French changed some of them.
#Listing the German communes and the two additional towns with german facilities
germanspeaking <- c("Eupen", "Kelmis", "Lontzen", "Raeren", "Amel", "Büllingen",
"Burg-Reuland", "Bütgenbach", "Sankt Vith", "Malmedy", "Weismes")
#Proportion of German-speaking towns with different names
popdata %>%
filter(TownNL %in% germanspeaking) %>%
summarise(NTowns=n(), N_SameName=n()-sum(DiffName), N_DiffName=sum(DiffName),
Prop_SameName =1-round(mean(DiffName),2), Prop_DiffName=round(mean(DiffName),2))%>%
knitr::kable()
| NTowns | N_SameName | N_DiffName | Prop_SameName | Prop_DiffName |
|---|---|---|---|---|
| 11 | 5 | 6 | 0.45 | 0.55 |
#German towns with two official names
popdata %>%
filter(TownNL %in% germanspeaking) %>%
filter(DiffName==TRUE) %>%
print(n=nrow(.))%>%
knitr::kable()
##   TownNL     TownFR      DiffName population Region   REFNIS
##   <chr>      <chr>       <lgl>         <dbl> <chr>    <chr>
## 1 Kelmis     La Calamine TRUE          10964 Wallonia 63040
## 2 Sankt Vith Saint-Vith  TRUE           9661 Wallonia 63067
## 3 Weismes    Waimes      TRUE           7493 Wallonia 63080
## 4 Bütgenbach Butgenbach  TRUE           5583 Wallonia 63013
## 5 Amel       Amblève     TRUE           5523 Wallonia 63001
## 6 Büllingen  Bullange    TRUE           5489 Wallonia 63012
| TownNL | TownFR | DiffName | population | Region | REFNIS |
|---|---|---|---|---|---|
| Kelmis | La Calamine | TRUE | 10964 | Wallonia | 63040 |
| Sankt Vith | Saint-Vith | TRUE | 9661 | Wallonia | 63067 |
| Weismes | Waimes | TRUE | 7493 | Wallonia | 63080 |
| Bütgenbach | Butgenbach | TRUE | 5583 | Wallonia | 63013 |
| Amel | Amblève | TRUE | 5523 | Wallonia | 63001 |
| Büllingen | Bullange | TRUE | 5489 | Wallonia | 63012 |
#Adding a reason column
reason_german <- popdata %>%
filter(TownNL %in% germanspeaking) %>%
filter(DiffName) %>%
mutate(Reason = "German region")
Always a topic for debate in Belgium: the towns with official language facilities. These are towns that belong to one region but they have some degree of bilingual facilities (it’s complicated!).
#Listing all towns with language facilities
faciliteiten <- c("Bever", "Drogenbos", "Herstappe", "Kraainem", "Linkebeek",
"Mesen", "Ronse", "Sint-Genesius-Rode", "Spiere-Helkijn",
"Voeren", "Wemmel", "Wezembeek-Oppem", "Edingen",
"Komen-Waasten", "Moeskroen", "Vloesberg")
#Proportion of towns with language facilities that have different names
popdata %>%
filter(TownNL %in% faciliteiten) %>%
summarise(NTowns=n(), N_SameName=n()-sum(DiffName), N_DiffName=sum(DiffName),
Prop_SameName =1-round(mean(DiffName),2), Prop_DiffName=round(mean(DiffName),2))%>%
knitr::kable()
| NTowns | N_SameName | N_DiffName | Prop_SameName | Prop_DiffName |
|---|---|---|---|---|
| 16 | 6 | 10 | 0.38 | 0.62 |
#Which towns have different names?
popdata %>%
filter(TownNL %in% faciliteiten) %>%
filter(DiffName==TRUE) %>%
print(n=nrow(.))%>%
knitr::kable()
##    TownNL             TownFR             DiffName population Region
##    <chr>              <chr>              <lgl>         <dbl> <chr>
##  1 Moeskroen          Mouscron           TRUE          57773 Wallonia
##  2 Ronse              Renaix             TRUE          26092 Flanders
##  3 Sint-Genesius-Rode Rhode-Saint-Genèse TRUE          18231 Flanders
##  4 Komen-Waasten      Comines-Warneton   TRUE          18102 Wallonia
##  5 Edingen            Enghien            TRUE          13563 Wallonia
##  6 Voeren             Fourons            TRUE           4129 Flanders
##  7 Vloesberg          Flobecq            TRUE           3426 Wallonia
##  8 Bever              Biévène            TRUE           2160 Flanders
##  9 Spiere-Helkijn     Espierres-Helchin  TRUE           2142 Flanders
## 10 Mesen              Messines           TRUE           1049 Flanders
## # … with 1 more variables: REFNIS
| TownNL | TownFR | DiffName | population | Region | REFNIS |
|---|---|---|---|---|---|
| Moeskroen | Mouscron | TRUE | 57773 | Wallonia | 54007 |
| Ronse | Renaix | TRUE | 26092 | Flanders | 45041 |
| Sint-Genesius-Rode | Rhode-Saint-Genèse | TRUE | 18231 | Flanders | 23101 |
| Komen-Waasten | Comines-Warneton | TRUE | 18102 | Wallonia | 54010 |
| Edingen | Enghien | TRUE | 13563 | Wallonia | 55010 |
| Voeren | Fourons | TRUE | 4129 | Flanders | 73109 |
| Vloesberg | Flobecq | TRUE | 3426 | Wallonia | 51019 |
| Bever | Biévène | TRUE | 2160 | Flanders | 23009 |
| Spiere-Helkijn | Espierres-Helchin | TRUE | 2142 | Flanders | 34043 |
| Mesen | Messines | TRUE | 1049 | Flanders | 33016 |
#Adding a reason column
reason_facilities <- popdata %>%
filter(TownNL %in% faciliteiten) %>%
filter(DiffName) %>%
anti_join(reason_city) %>%
mutate(Reason = "Language facilities")
The language border obviously plays a big role, so I gathered any other towns along the border even if they have no language facilities. In many of these cases, towns have changed region over the course of history.
Lastly, I wanted to make an “other reason” category, and bind all the reasons to my main data.
#Language border
language_border <- c("Heuvelland", "Komen-Waasten", "Mesen", "Menen", "Kortrijk", "Moeskroen", "Spiere-Helkijn",
"Ronse", "Elzele", "Vloesberg", "Lessen", "Geraardsbergen", "Bever", "Opzullik",
"Edingen", "Rebecq", "Tubeke", "Kasteelbrakel", "Halle", "Sint-Genesius-Rode",
"Eigenbrakel", "Terhulpen", "Waver", "Graven", "Bevekom", "Geldenaken", "Tienen", "Lijsem",
"Hannuit", "Borgworm", "Oerle", "Tongeren", "Bitsingen", "Voeren", "Wezet")
reason_langborder <- popdata %>%
filter(TownNL %in% language_border) %>%
filter(DiffName) %>%
anti_join(reason_city) %>%
anti_join(reason_facilities) %>%
mutate(Reason = "Language border")
#Other
reason_other <- popdata %>%
filter(DiffName) %>%
anti_join(reason_city) %>%
anti_join(reason_BXL) %>%
anti_join(reason_german) %>%
anti_join(reason_facilities) %>%
anti_join(reason_langborder) %>%
mutate(Reason = "Other")
#Merging reasons
reason <- bind_rows(reason_BXL, reason_city, reason_german, reason_facilities, reason_langborder, reason_other)
#Searching for duplicates before join
reason %>%
group_by(REFNIS) %>%
filter(n() > 1)
#Joining
popdata_reason <- left_join(popdata, reason)
popdata_reason <- popdata_reason %>%
mutate(Region2 = ifelse(DiffName==TRUE, Region, NA))
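The chain of anti_join() calls above boils down to a priority order: a town gets the first reason that applies, going from Brussels to City, German region, Language facilities, Language border and finally Other. As a rough sketch of that logic only (not the code used for the maps), the same order can be written in base R by assigning the lowest priority first and letting higher priorities overwrite it. The logical flag columns are hypothetical names, filled in by hand for five towns mentioned in this post:

```r
# Hand-filled flags for five towns from this post (illustration only)
towns <- data.frame(
  TownNL         = c("Brussel", "Moeskroen", "Kelmis", "Ronse", "Temse"),
  in_brussels    = c(TRUE,  FALSE, FALSE, FALSE, FALSE),
  is_city        = c(TRUE,  TRUE,  FALSE, FALSE, FALSE),  # population > 34000
  is_german      = c(FALSE, FALSE, TRUE,  FALSE, FALSE),
  has_facilities = c(FALSE, TRUE,  FALSE, TRUE,  FALSE),
  on_border      = c(FALSE, TRUE,  FALSE, TRUE,  FALSE),
  stringsAsFactors = FALSE
)

# Start with the fallback, then overwrite in increasing order of priority
towns$Reason <- "Other"
towns$Reason[towns$on_border]      <- "Language border"
towns$Reason[towns$has_facilities] <- "Language facilities"
towns$Reason[towns$is_german]      <- "German region"
towns$Reason[towns$is_city]        <- "City"
towns$Reason[towns$in_brussels]    <- "Brussels"

towns[, c("TownNL", "Reason")]
```

Because every town ends up with exactly one value, this version cannot produce the duplicate REFNIS rows that the check before the join is guarding against.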
A quick reason map:
The Brussels and German regions are pretty obvious dots on the map, along with the language border and facility towns. Large cities are scattered across the whole of Belgium, and many of the unidentified scattered dots also represent smaller cities (like Aarlen/Arlon or Temse/Tamise).
There is another group of towns close together that have two names even if
I wanted to bring it all together in one final interactive map:
## Warning: One tm layer group has duplicated layer types, which are omitted.
## To draw multiple layers of the same type, use multiple layer groups (i.e.
## specify tm_shape prior to each of them).